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Abstract 

Segmentation of injured or unusual anatomic structures in medical imagery is a problem that has 
continued to elude fully automated solutions. In this paper, the goal of easy-to-use and consistent 
interactive segmentation is transformed into a control synthesis problem. A nominal level set PDE 
is assumed to be given; this open-loop system achieves correct segmentation under ideal 
conditions, but does not agree with a human expert's ideal boundary for real image data. 
Perturbing the state and dynamics of a level set PDE via the accumulated user input and an 
observer-like system leads to desirable closed-loop behavior. The input structure is designed such 
that a user can stabilize the boundary in some desired state without needing to understand any 
mathematical parameters. Effectiveness of the technique is illustrated with applications to the 
challenging segmentations of a patellar tendon in MR and a shattered femur in CT. 

Index Terms 

PDE Control; Reaction-Diffusion; Human-Computer-Interaction; Level Set Methods; Magnetic 
Resonance Imaging; Computed Tomography; Image Segmentation 

I. Introduction 

Microwave-frequency Magnetic-Resonance-Imaging (MRI) [1] and X-Ray Computed 
Tomography (CT) [2] yield three-dimensional volumetric images which are viewed by a 
medical professional for diagnosis, treatment planning, or population studies [3], [4]. 
Typically, only a particular anatomic region or organ is of interest and must be segmented. 
Segmentation is the task of identifying and localizing salient structures in the image volume. 
Since there is an abundance of raw image data that is not analyzed due to the infeasible time 
and cost of manual segmentation, automated methods for segmentation are the subject of 
much recent medical computing literature [5]-[7]. However, a human expert's ability to 
combine observed image data with prior anatomical knowledge to accurately perform 
segmentation is unmatched by computer algorithms. There is substantial mistrust both from 
patients and doctors towards fully automatic algorithms. Recognizing this, there has been a 
recent drive towards interactive segmentation [8]-[10]. Interactive approaches use a data- 
driven automatic algorithm to process a majority of the volume. As the automatic 
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segmentation runs and displays the current state, a human user can influence the algorithm's 
behavior to more closely align with an expected result. Fig-1 shows an example of the 
segmentation corrections a human would make. Ideally, an interactive system should enable 
a user to create excellent segmentation results with a minimal amount of time and effort. 

Interactive medical image segmentation employs software tools such as Seg3D [11] and 3D- 
Slicer [12] for applying algorithms and visualizing the results; a human expert uses the 
algorithms to achieve a segmentation that is as close as possible to the ideal region 
boundary. Available algorithms include iterative methods with a concept of time; a partial 
differential equation (PDE) is used to model the space-time relationship between image data 
and segmentation boundaries [13]— [16]. Given user-specified parameters and initial state, 
incremental modifications are made to the segmentation and shown to the user [17]-[20]. It 
is not clear to a user whether it is possible for their ideal region boundary to be a steady- 
state. In practice, they must stop the algorithm at some time tf when the segmentation is 
reasonably accurate, then apply smoothing and manual corrections to the boundary. An 
alternative approach is to formulate segmentation as a time-independent problem; in [21]- 
[23], user input acts as a constraint in finite-dimensional nonlinear optimization problems. It 
is not known a-priori whether the user's ideal region boundary is a feasible solution for some 
collection of constraints, while a changing number of user input constraints can affect the 
computational complexity. Furthermore, the time-independent formulation can lead to large 
changes in the segmentation output when new user input is received. Other classes of 
algorithms such as graph-cuts [19], [24], [25] are also effective for automated segmentation; 
we consider only level-set PDE algorithms due to their theoretical compatibility with 
methods in the PDE control literature. 

Level-set methods define a region boundary implicitly as the zero level-set of a function ^(x, 
t) with domain 1 OCR". Temporal changes of the region boundary are described by a 
partial differential equation (PDE) as a consequence of the implicit representation. Typically 

arises as the gradient flow that minimizes some meaningful functional of </> and the 
image data I(x). From a controls perspective, the image-dependent PDE is an open-loop 
system. We present a framework for interactive segmentation using feedback augmentation 
of a level set PDE system; the results and theory are a substantial extension of the 
preliminary version [26]. This paper is motivated by the following observation: when human 
users influence the level-set evolution, they have in mind a desired reference state and are 
trying to apply control to an image -dependent PDE system. 

In the literature on control of PDE systems, two characteristics of problems are the domain 
of input actuation and the available measurement (pointwise throughout a domain, 
boundary- value only, or as an integral over space). Control through region boundaries is of 
paramount interest; [27] characterizes the stabilizing controls and admissible boundary 
conditions for a class of unstable reaction-diffusion (R-D) systems. Similarly, [28] explicitly 
computes invariant regions for coupled R-D systems. Stabilization of the viscous Burgers 
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equation Ui ^~^~2^^~ eUxx usm g boundary actuation is achieved in [29], [30] by first 
designing a feedback law using u(x, t) over all x, then deriving a w-observer that uses only 

d 1 2 

boundary measurements. In [31], the in viscid Ut ^~^X~2 l ) = 0 is stabilized with a boundary 
input w(0, i) from an admissible set of controls that admits a weak solution to the initial- 
boundary- value problem. A common theme in PDE-control for setting input and gain values 
are scalar functions w(t) defined as functional of the state u(x, t) [32]-[34]; e.g., w(t) = Sq 
k(x, t)u(x, t)dx. Such inputs appear in this paper as well, with image dependence entering via 
a term analogous to k(x, t). The model used in this paper uses actuation and measurement 
within a neighborhood of the time- varying segmentation boundary. 

Contribution 

Using a PDE formulation guarantees that the computational complexity is fixed and that the 
segmentation result changes continuously over time. By incorporating control-theoretic 
tools, it is shown that the steady state segmentation can be driven to an ideal reference 
boundary. Input from the human user indicating locations where the current state does not 
match the desired reference state is processed and used as feedback in the PDE. An 
approach similar to backstepping [33], [35] is used; first, stability of a labeling error 
functional is shown under the assumption of a known reference state. Second, an auxiliary 
observer-like system that reacts to user input is formulated. The net coupled system is shown 
to have bounded error when a sufficient amount of user input has been accumulated. To the 
best of our knowledge, this is the first approach to interactive level set segmentation with 
input from the user used in feedback to guarantee stabilization about a reference boundary. 

Organization 

The remainder of this paper is as follows. 

Image segmentation based on the narrow-band level- set method is reviewed in Section II. 
Following an approach similar to back- stepping, a controller is proposed in Section III that 
stabilizes a labeling-error term, assuming exact knowledge of a reference state. An auxiliary 
observer-like system is designed in Section IV to process a user input signal and estimate the 
reference state. Application of the technique to interactive segmentation of CT and MRI 
volumes is demonstrated in Section V, using images that are difficult to segment with 
existing methods. The proposed algorithm is compared to related methods in Section VI. 
Section VII summarizes the results and applicability of the approach. Key components of 
the final closed-loop system and corresponding paper sections are visualized in Fig-2. 

II. Level Sets and Automated Segmentation 

The class of open-loop systems considered in this work are PDE-based segmentation 
algorithms using the popular level set method, reviewed in Section II-A. Section II-B 
describes the limitations of open-loop segmentation, thus motivating the feedback control 
model of interactive image segmentation in Section III-A. 
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A. Review of Level Set Methods 

Level set methods represent time- varying region boundaries in a computationally straight- 
forward manner [36], [37]. Define Q C W 1 to be the spatial domain and x a coordinate in Q. 
Labeling assignments are represented with an implicit function $x, t) : Q x [0, t) — > R. 
Boundaries between regions of interest are represented as level sets where $x, t) = C. 
Propagation of $ over time is defined by </> t = - (V^, f) for some vector field f that is a 
function of image data, of ^, and spatial derivatives of In this paper, t) > 0 denotes 
the interior of a segmented region and the (outward) normal vector N along a level set of ^ is 

ivr V4> 

^ |y0| . Assuming the gradient and normal of (/> exist, the general form of a level set 
PDE is 



0,= |V0|(N,f) = F|V0|. (l) 

In "variational level set methods" [38]-[40], the F in Eq-1 is constructed to regulate some 
quantity of the form 

S(t)=/ nfl (x,0,V0)dx. ( 2) 

As pointed out in [41], many image segmentation applications use an artificial time 
parameter t, which arises solely due to an iterative minimization of Eq- 2. In this paper, 
however, t corresponds to a physical time since the human user watches a time- varying 
visualization of t) to decide where and when to provide input. 

It is usually desirable to have \</> t \ > 0 only in a neighborhood of the moving zero level set 
[42]-[44]. This narrowban d restriction is used in the image processing community for 
several reasons. First, efficient numerical techniques such as the "sparse field method" [45] 
update (j) only within the narrow-band region. Dimensions of 3D medical image volumes are 
on the order of 512 3 ; real-time performance on desktop computers is attainable only by 
restricting computations to a subset of Q. Second, algorithms typically seek to separate Q 
into statistically different regions within the I(x) image. It is sufficient to know only sign(^) 
when labeling regions as interior and exterior; the magnitude of <f> has little meaning in 
segmentation applications. Finally, if </> t is nonzero for arbitrary values of 1^1, zero crossings 
can develop far from the initial $ = 0 level-set; a visualization of $ would show new 
"boundaries" spontaneously appearing. Users will be confused by such behavior; they 
expect to initialize 0) and watch a moving level set. 

Q is divided into exterior and interior regions by a regularized version of the Heaviside step 
function denoted by H e , illustrated in Fig-3. This regularized step function and its derivative 
8 e are defined as 
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1 if0>£ 
J? £ o(t)={ 0 if0<-£ (3) 

§ ( 1 + f + 1 sin ( ^ ) ) otherwise 



1/e if 0=0 

5 £ o 0= <( 0 if |0| (4) 

^(l+cos(^)) otherwise 

This paper considers systems described by narrowband level- set PDEs with the general form 

V0 



(f) t =S £ O 0 



G(J^o0, J)+AV 



|V0| 



,0(X,O) = 0 O (X). (5) 



Application- specific goals, such as minimization of a functional, dictate the choice of 
image-dependent G(-) in Eq-5; an example is given in the next section. A concise notation 

VF 

for the elliptic operator v ' | VF| * s ^F), wnere F is a smooth function with non-vanishing 
gradient defined on x E Q. 

In practice, the nonlinear PDEs encountered in level-set segmentation will tend to develop 
discontinuities. Periodic reinitialization of $ to a signed distance function [37], [46] 
mitigates these effects while preserving the $ = 0 level set. Re-distancing enforces the 
properties 



0<Pl< |V0| <p 2 , -KO<^(0)<^O, (6) 

where p\,P2, and kq are fixed constants for a given re-distancing method. These bounds are 
helpful for control synthesis in Section III. 

B. Segmentation as an Open-Loop System 

Automated image segmentation systems are designed under the assumption that a particular 
discriminative model captures distinguishing features that separate regions of interest from 
the rest of the image volume. Consequently, the systems often lead to erroneous 
segmentation results when their underlying assumptions do not hold. The term "open-loop" 
in the context of automatic segmentation means that the system evolves without any external 
input that might indicate failure of model assumptions and that the boundary is not moving 
towards a desired steady-state. Such systems arise from discriminative statistical models in 
the literature, wherein functionals are proposed that either maximize statistical differences 
between an object and its background or maximize similarity of the object to a template 
[40]. Several examples of statistical quantities used are region-based feature means [14], 
feature covariance [47], and ^-dimensional non-parametric density estimates [48], [49]. 

Nevertheless, a recurring problem is that many objects of interest do not coincide with 
minima of a proposed functional. As an example, consider segmentation via the "mean- 
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alignment" system [14]. The weighted means of image I(x) in the interior {</> > 0) and 
exterior < 0) regions are 

^ f Q [je £ o<l>]idx ^ j n [i-jr e o0]/dx 

A gradient flow for the functional 

E(t) = ^f Q je £ o(f>(I - /z in ) 2 +(l-^ £ o0)(J - / x out ) 2 dx+A/ fi |VJ^ £ o 0| dx (8) 
gives the following narrowband open-loop system: 

(f>t =^O0[-(/- / X in ) 2 + (/-/X O ut) 2 + A^(0)], 

0(x,O) =00 (x). U 

Fig4 and Fig-5 illustrate the behavior of open-loop system Eq-9 on a synthetic image that 
resembles a noisy image of the left and right ventricles in the brain. The desired 
segmentation boundary is drawn in dashed red, while the moving zero level set is solid 
green. Although the open-loop system correctly stops along most of the reference boundary, 
it creeps into an adjacent region of I(x) in the course of minimizing E(t). 

III. Feedback Augmentation of a Narrowband Levelset PDE 

A. Reference State and Input Structure 

Rather than having the human user give up on the PDE system and manually outline the 
desired region of interest, the PDE can be augmented with a user-driven control input. A 
control solution is sought due to limitations in the efficacy of open-loop system Eq- 5 for 
real images. Necessary human effort in segmentation can then be kept low; the user need not 
apply input in locations where the open-loop system keeps $ in agreement with a desired 
segmentation. 

Let yr denote the ideal reference segmentation. A human user could manually trace the level 
set yr{x) = 0 if given unlimited time. We seek to drive <p towards an explicit estimate of yr, 
while maintaining closed-loop stability and minimizing burden placed on the user. User- 
driven control effort should preserve the advantages of the narrowband formulation noted in 
Section II- A; therefore, an admissible control signal fix, t) will act only on the 1^1 < e 
subdomain of Q. The closed-loop system then becomes 

<t> t =8 £ o 0[G(0 5 /)+Aac(0)+/(x, t)]. (10) 

In this section, y^x) is assumed known; a control is synthesized to drive $ such that it 
matches yr. Later in Section IV, a coupled system that estimates yr from available user input 
is formulated. 



IEEE Trans Med Imaging. Author manuscript; available in PMC 2014 May 01. 



Karasev et al. 



Page 7 



B. Existence of a Regulatory Control 

Define the pointwise and total labeling error as £ and P, respectively: 



e(x,t)±(j*w-jr e o^), ®(t)±^f^ 2 d*. (id 

If yr is known and available, regulation of V(t) is straightforward with the following two 
theorems in this section. The regulatory control uses known bounds on the image-dependent 
G term of Eq-10; define Gm(x), Gm to be upper bounds such that for any segmentation state 



|G(0(x),/(x))| <G M (x)<G M . (12) 

Theorem III.l Using a spatially- varying U(x), a control for Eq-10 that stabilizes functional 
Eq-11 is 

/(x, t)= - £*7 2 +A(£) [k(S 2 o 0 • 0 - «(0)]. (13) 
Given constants A.q > 0, \\ > 0, /? G (0, 1), a sufficient condition for boundedness of V is 

(G fx) 

A(t)=Ao+Ai^, and | C/(x)| > [^J 1 ^ 1 ) ~ U m- ( 14 ) 
Furthermore, when the error £is large in the sense of 

p!n%°4>\Z\te<In s l°<l>Z*te> ( 15 ) 
the rate of convergence is bounded by 



j®<-f^ 2 £ o<t>edK<o. (i6) 

Proof: Re-arranging terms in T>' and integrating by parts making use of 8 e ° </> = 0 on 30: 



j t ®=fn&dx=J Q Z6 e o<l>-<t> t dx (17) 



--In 5 e ° flW, 0) - {iUf]dx+X! n 5 2 e o 0£V • [vlo^l ^ (18) 
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=JnS ° flW, 0) - (^)l dx - A/ n V(£ o • (19) 



=/jA 2 ° 4>iG(I,4>)dx - f a 5* o iM$U)'dx - A/ n |V(«* o #)|dz. (20) 
The Poincare inequality in L 1 guarantees the existence of a constant r such that 



In \ S s ° 0? ^ < rf Q V(<J* o 0£) dx, (21) 



-/n | v (^ 2 ° < -^/n*l ° ^ If I cfa, (22) 

where r is at most half the diameter of Q [50]. Substituting Eq- 22 into Eq- 19 bounds the 
Lyapunov functional' s time derivative: 

j t 9 < J Q S 2 o <t>[£G{1, 4>) - mf\dx - £ J n # o <$> \i\ dx. (23) 
The case of l£l being large relative to p in an integral sense (Eq-15) also implies 



j Q 8 2 o <(>\ZG\dx < o </> |f| M G M dx. ( 24) 
Substituting the condition on lt/(x)l magnitude from Theorem III. 1 gives 

and the error rate V' is negative semidefinite with a bound 



(25) 



d 

-j9 < ~fn 6 e ° 0C 2 ^ < (26) 

Boundedness of P is established after substituting the A, proposed in Theorem III. 1 : 



< J a S 2 o # Gdx - o 0 |f | dx (27) 



< G M J Q S 2 S ° 0 | f | dx - ±(\o+\i9)f a 6t o 4> |e| dx (28) 
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< (G M - (Xo+X^Mf^ o 0 |e| dx. (29) 

V' > 0 is only possible for P < (VGm - A,q) with ^ - 0 otherwise. Thus, V is bounded. 

Remark A near-optimal (i.e. low) value for r can be obtained by substituting the definitions 

of 8^, H s (eqns. 4,3) into | V(^ o 0 £)|, applying the chain rule, and comparing to o 0£| 
(omitted for space). In practice, r can be directly estimated via numerical evaluation of the 
integrals in Eq- 21 and is on the order of the IV^I due to re-distancing. 

IV. Auxiliary System Design 

In the previous section, a stabilizing controller was developed that drives the segmentation 
towards a reference state relying on fixed quantities yr and U, which are not known in 
practice. The user will be employed to provide the missing information by occasionally 
applying discrete corrections to the segmentation; these corrections are accumulated over 
time. From the current segmentation and user input, an estimate of the ideal yr must be 
inferred. These considerations lead to the coupled dynamical system presented in this 
section. A method for processing discrete input from a mouse or stylus to generate a 
distributed U is proposed in Section IV-A, while the accumulation of input is addressed in 
Section IV-B. An observer-like system is formulated in Section IV-C to compute yr, the 
explicit estimate of ideal state yr. 

A. User Input Processing 

Raw input from the user arrives in the form of binary decisions as to whether a given 
location in space is correctly labeled as inside or outside the segmentation boundary. The 
user clicks with a mouse or stylus at discrete points in Q and time, as illustrated in Fig-6. 
Define fy, k G N to be the sequence of times at which the user sees a visualization of $ and 
has an opportunity to apply input. At time they look at the labeling of fox^ t^) and either 
(a) apply a signed impulse denoting a "vote" for setting the label there or (b) do nothing 
because they agree with the current labeling of fox^ t^). Denote these sequential actions as 

( +1 if ^(x fc )>O>0(x /c ,t fc ), 
u k =l -1 if ^(x fc )<O<0(x fc ,tfc), (30) 
I 0 otherwise. 

Before these inputs can be accumulated into U, they must be mapped into the space-time 
domain with some fixed support Define the function /z(x, t) as 

M x > *) - 5j*fcfco(x - x fc )<$(t - (31) 

k 

where /zq(-) is a weight function and 6{t - fy) is the Dirac delta As noted in [21] using an 
image-dependent metric for /zq(-) is a useful way to weight spatial distances The examples in 
this paper use 
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a- 



•2+|7(x)-/(x fc )| 




> (32) 



which incorporates both Euclidean distance from x to x^ and similarity between image 
values at I(x) and /(x#). 



The label error impulse inputs accumulate over time to define the control U. However U 
must be regulated to prevent excessive input magnitudes while ensuring spatial smoothness 
and enabling \U\ > Um to satisfy the conditions of Section III-B An undesirable excess U 
and IV U\ can occur in U t = h(x, t) because the human user causes h(x, t) without 
understanding how their "vote" inputs influence the segmentation dynamics Furthermore 
when the label-error is shrinking at a consistent rate but over a large area it is expected that 
the human user will be impatient and apply excess input magnitude in an attempt to speed 
up the moving <j> = 0 boundary. 

Regulation of U is achieved using a nonlinear diffusion process together with accumulation 
of h(x, t): 



Changes in U are dominated by h(x, t) for \ U\ « Um- As I U\ grows, the diffusion coefficient 
H s 0 ((UIUy[) 2 -\) gains influence. The following example illustrates qualitative behavior of 
the U system. 

Example Consider the two-dimensional image slice shown in Fig-7a. A simulated "user" 
chooses locations x^ at which an update h(x, t) is applied according to Eq-31. Blue 'x' and 
red '+' denote places where h(x, t) is negative and positive, respectively. Fig-7b shows what 
U would look like without nonlinear diffusion. Fig-7c shows the response of the regulated 
[/-system from Eq- 33. Comparing 7b and 7c, it is clear that the latter is smoother and 
satisfies U < Uy[- 



C. Label-Error Estimate Dynamics 

Dynamically estimating the reference state necessitates a coupled system; <j> and the estimate 
of y evolve simultaneously Let ykx, t) be an estimate for yrtx) and define the error terms 



B. Accumulation of User Input 



=fc( Xjt )+v • [je e o ((u/uj 2 - i)w], e/(x,o)=o. 03) 



£ = J$? £ o 0 — J$? s o ^, e v = 



Jtfe o i/; - 3tf £ o U. (34) 



Feedback in the (/> system (Eq-10) will now use ^, 
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& =^o0[g(0,/)+Ak(0)+/(x,^|)] (35) 

0(X,O) =0o, 

where the initial <f>§ is specified by the user The auxiliary y(x, t) observer-like system is 
driven by accumulated user input U together with error terms £ and ejj\ 



$(x,0) =0(x,O). 

Total labeling error is defined by the following functionals of £ and ejj, where a is a 
constant parameter: 

observer vs. user input :&(t) = f n ]-(aU) 2 e^dx y (37) 



1 -2 

observer vs. visualized state:^(t) = 

In addition to stabilizing T + v, the control proposed in Theorem IV. 1 is designed to achieve 
a useful qualitative behavior. When the user is satisfied with the agreement between ^(x, t) 
and their ideal y{x), it is assumed that U(x) remains constant; either the user never needed 
to apply a correction near x or has otherwise stopped adding more inputs. In this case, yr 
should follow <f>. Conversely, when U(x) grows due to persistent human input, yris to 
become increasingly driven towards U irrespective of agreement between yr and <f>. 
Subsequently, yr should pull $ along due to the coupling term -£U 2 (Eq. 13) in the closed- 
loop dynamics of (/> . 

Theorem IV.l Let g(£ U, ejj) = -eu(aXJ) 2 and consequently 



^ t =S £ o^[i- eu (aU) 2 ]. 



(39) 



Assume that user input has stopped (U remains constant) and Theorem III. 1 is satisfied. 
Then, the sum V(t) — V + T has a negative semidefinite time derivative: 

V\t) < -f Q S 2 e o<p?dx - f Q Sf o $ [{aU) 2 e v - if dx. (40) 

Proof: 

Computing the time derivative V'(t) = V> + T' , 

^=In(^) 2 e u e u d^=J n (aU) 2 e u (S £ o $ • $ t )dx (41) 
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@'=fj£dx=f n £(5 £ o c/> • c/> t - S £ o ^ • &)dx. (42) 

Substituting for ^ and 

- Jn*2 o iP(aU) 2 [e 2 v a 2 U 2 - ej] dx m 

®'=Sn 6 h<t> [£G(<P, I) - fu 2 +\i K (S 2 o 0 • £)] dx-f n 5 2 oi, - eJ(aU) 2 ] dx. (44) 

After adding Eq-43 to Eq-44 and combining the 5 2 o terms, the portion containing error ejj 
can be conveniently factored: 

&'+@'=f Q 5 2 o4> [iG(<p,I) - fU 2 +\i K (6 2 o <(> ■ £)] d X -f Q 6 2 £ oTp [e 2 v a 4 U 4 - 2(aU) 2 eJ+e] dx 

=Jn S h4> [iG(<f>, I) - eU 2 +\£ K (6 2 o <t> ■ I)] dK-J Q 6 2 oi, [e v (all) 2 - |] 'dx. (46) 
When U and A. satisfy Theorem III. 1 , it follows that 

v'=&'+9i < -J n S 2 £ o 0 • f 2 +<S* o^{aU) 2 e u - ^dx < 0. (47) 

Thus, V'(t) is negative semidefinite and Vif) bounded. 

D. Synthetic Image Example 

To demonstrate the coupled dynamics, this section considers a simple segmentation 
scenario. Fig- 8 illustrates closed-loop system behavior on the synthetic image used 
previously in Section II-B. In the absence of user input, $x, t) behaves like the open-loop 
system; Fig-8a shows the ^estimate following <j> until they both reach steady state. With 
user input, the estimated ideal contour mediates between user input and the open-loop 
segmentation dynamics. In Fig- 8b, the user starts to apply input upon noticing the (/> = 0 
boundary creeping through the bridge between the two ellipsoidal regions. Input stops and 
the system reaches steady- state after the user is satisfied with the displayed segmentation. 
Comparing Fig-8a and Fig-8b, we see that regardless of user input, the closed-loop sytem 
aligns the zero level-sets of (/> and yat steady-state; the presence of user input in Fig-8b 
shifts the steady-state of <j> and yr. In both cases the a for Eq-39 is set to l/Uy[. 

V. Application to MRI and CT Images 

In this section, the feedback- augmented level-set methods are applied to two specific 
problems involving interactive medical image segmentation of X-ray Computed 
Tomography (CT) and Magnetic Resonance Imaging (MRI) volumes. In Section V-A, a 
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fractured piece of the femur is segmented in a CT volume. Next, the technique is applied to 
extract a patellar tendon in an MRI volume in Section V-B. For both applications, we first 
review the clinical problem. Next, an open-loop system appropriate for the image type is 
chosen. Finally, the closed-loop <f> system is summarized, followed by a discussion of the 
segmentation results. 

A. CT Segmentation with Mean-Alignment 

The realignment of bone fragments after a fracture, also referred to as fracture reduction, is 
a crucial task during the operative treatment of complex bone fractures. Anatomically 
incorrect fracture reduction can result in severe post- traumatic complications. In order to 
avoid such problems and obtain an optimal fit between all relevant fracture fragments, the 
surgeon traditionally exposes the fractured bone by cutting the soft tissue envelope to access 
the fragments directly. Subsequent realignment of the recovered fracture fragments requires 
a trial and error approach, which prolongs surgery and increases the risk of complications 
for the patient. Therefore, there is a clear need for the development of less invasive 
techniques to reconstruct complex fractures. Segmentation of the image data to localize the 
fragments (as in Fig- 10) is a key step first step toward computing and planning the optimal 
way of realigning the fractured bone. 

Bone tissue generally appears very bright in CT imagery; therefore, the segmentation of 
bone in CT is modeled with the mean-alignment system Eq-9. Using the control from Eq-13 
leads to the closed-loop system 

</H=6 £ o 0 [-(/ - // in ) 2 + (I - //out) 2 - £U 2 +\(t)K(6 2 £ o 0 • D] . (48) 

Note that a healthy bone can often be segmented in its entirety using the open-loop system 
alone, since the zero level-set of ^is naturally drawn to boundaries of bright objects. 
Interactive control becomes vital, however, when segmenting bone subject to disease or 
injury; accurate segmentation is not possible without feedback. 

Fig- 11 illustrates several aspects of the interactive system applied to the segmentation of a 
large bone fragment. Local maxima of U(x) are shown as markers along with the 
intermediate ^=0 boundary in Fig- 11 (a)-(d); the green semi-opaque surface represents the 
user's reference y = 0 level- set. Regions of bright CT image values are quickly segmented 
by the open-loop system; the user generates some input along what appears to be an edge of 
the fractured bone where I(x) is darker (Fig- 1 la). After the system has segmented this first 
edge and is nearly at steady-state, the user finds another edge along which to apply input 
(Fig- 1 lb). A further refinement is made in Fig-1 lc that leads to the final steady-state 
segmentation of Fig-1 Id. The number of voxels actuated by the user's mouse strokes are 
plotted versus (scaled) time in Fig-1 le. In a fully manual segmentation, each of the 16404 
boundary voxels in Fig-1 Id would need to be marked by the user; the second y-axis on the 
right indicates the actuated voxels as a percentage of the fully manual effort. It is difficult 
for a person to accurately decide whether or not a fracture edge in a distant part of the 
volume is part of the same fragment Fig-1 le indicates that a substantial portion of time is 
spent with (/> near a steady-state while the user scrolls through slices to decide where fracture 
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edges are located and whether these edges are part of the same fragment or another one in 
close proximity In Fig- 10a two light regions are determined to be part of the fragment being 
segmented while a third (in the upper left) is a separate bone fragment. 

Normalized histograms of the image intensity distribution inside of the segmentation 
boundary at steady-state are shown in Fig- 1 If Without feedback the segmentation encloses a 
region with a highly peaked I(x) histogram In contrast the closed-loop system reaches 
steady- state with a heavier- tailed intensity histogram The distribution shift is due to the user 
applying input to correctly label bone near and along the jagged fragment edges These 
regions are precisely where we care most about accurate segmentation since the fragment's 
edges are to be matched with those of other fragments during the fracture reduction task. 

B. MRI Segmentation with Localized Statistics 

Surgical repair of a torn anterior cruciate ligament (ACL) requires choosing a location from 
which to harvest a graft of sufficient length and thickness. The most common choice today 
is the patellar tendon (PaT). While the width and thickness of a PaT are quite predictable 
based on patient height and weight, the tendon length varies widely. This variability in shape 
continues to complicate surgery due to mismatch between the graft and drilled tunnel, 
especially in "anatomic reconstruction" [51]— [53] where the replacement ACL is to be 
oriented exactly as before the injury 2 . Quantifying the variability of PaT shape and 
comparing it to other graft choices (namely the hamstring and quadriceps tendons) requires 
accurate segmentation in MRI volumes. 

Soft tissue including tendons and ligaments is readily visible in MRI, unlike CT where only 
mineral-dense bone gives a strong response. However, images obtained by MRI have a 
complicated mapping between tissue type and observed intensity; segmenting soft tissue in 
MRI is generally more difficult than bone in CT. The distribution of intensity values in MRI 
arising from a particular anatomic structure will vary significantly between slices (Fig- 12b), 
and will also overlap the distributions of other structures (Fig- 12a). An effective approach 
for MRI segmentation is to separate regions based on spatially- varying statistics of I(x). To 
do so, open-loop dynamics are chosen to use the localizing active contours of [54] that 
define intensity means jiii n (x) and aw( x ) locally as integrals over a Euclidean ball of radius 
r. With feedback the system becomes 

<fH=6 £ o 0 /) - £u 2 +\(t)K(5 2 £ o cf> • f)] . (50) 

Despite the advantages of the underlying open-loop system segmenting a PaT remains 
challenging for two reasons First, the tendon is very thin relative to its height and width 
making a satisfactory choice of r in Eq-49 difficult Second, I(x) at the insertion points of the 



as opposed to the more common approach of attaching a shorter graft to an arbitrary reachable point on the bone surface. 
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tendon has the same distribution locally as adjacent connective tissue The human user 
however employs their anatomic knowledge to enable successful segmentation via the 
closed-loop system Fig- 13 shows the final result; the tendon has been segmented between its 
attachment on the inferior pole of the patella to the end of its insertion on the tibial tubercle 
For context the patella bone is also segmented and displayed. 

Incremental progress during interactive segmentation of the tendon is shown in Fig- 14 (a)- 
(d) Red and blue markers denote positive and negative extrema of U, respectively while the 
green semi-opaque surface represents a reference segmentation known to the human expert 
As the (j) = 0 boundary evolves after initialization a small amount of input yields the 
segmentation in Fig- 14b With the bulk of the tendon outlined the user applies input to fill 
gaps in the vertical edges and remove the over- segmented regions around the insertion 
points at the patella and tibia bones (Fig- 14c) Unlike the fracture scenario in Fig-1 1 the 
open-loop system applied to segment this tendon leads to massive "bleed- through" of the 
segmentation because the image distribution around the tendon insertion points is identical 
to that of the tendon itself Fig-1 4d shows the steady-state reached by the closed-loop 
system; user input stabilizes the segmentation at the desired reference boundary and 
prevents bleed-through past the insertion points on the patella and tibia. In Fig-14e, the 
number of voxels actuated by the user's mouse strokes is 4.7% of what would be needed to 
trace all of the tendon's boundary voxels manually. Comparing Fig-1 le and Fig-14e, the 
latter has more piecewise constant regions because the human user spends substantial time 
looking for anatomic markers and adjusting displayed image contrast to decide where the 
tendon begins and ends. 

VI. Comparison to Related Methods 

A. Overview 

In many implementations of level-set segmentation (e.g. [11], [17], [20]), the smoothing 
factor X is a parameter that is set by a user. However, understanding such a parameter 
requires users to have more mathematics background than is typical for the medical 
community. Here, we set X automatically to achieve desired behavior. The PDE control 
formulation here has a constant computational cost with respect to amount of user input and 
no abrupt changes to the segmentation, unlike in [21]-[23]. Under the proposed controller, 
sufficient input U(x) from the user guarantees agreement with the reference state; relaxed 
constraints in [22] dictate that it may be impossible for the segmentation to respect the user's 
inputs. Rushed use of the mouse by a human is not possible in [21], [23] because constraints 
are exactly enforced. In contrast, the input processing used in the current work provides 
leeway for the user: a small I U\ will not dominate the open-loop dynamics. If needed, a large 
accumulation of I U\ is achieved by "scribbling" repeatedly in a region. 

It is emphasized that the closed-loop control formulation in this paper does not seek to 
replace existing curve evolution algorithms but rather to augment them. The control- 
theoretic approach enables a user to reach the desired segmentation at steady-state; running 
the level set evolution for a longer time will not cause the boundary to "bleed-through" or 
contract. 
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B. Quantitative Comparison with GrabCut 

Two orthopedic images are used as test data to quantitatively compare the method presented 
in this paper with the popular GrabCut algorithm [19]. The user's goal is to segment the 
epiphysis and physis of the femur in Fig- 15a and Fig- 15b, respectively. User input via 
mouse click- and-drag is implemented and measured identically for both algorithms. The 
GrabCut implementation used here is available as part of the OpenCV library [55]. A 
location through which the cursor was dragged is defined as an "actuated voxel;" the extents 
around the cursor that mark seed regions in GrabCut are not counted towards this total. 
Locations in the image whose assigned label changes between background and foreground 
are tracked over time and are referred to as "reclassified voxels." In both implementations, 
total segmentation time is primarily a function of how long it takes the user to evaluate the 
current state and apply more corrective mouse strokes. The total number of actuated voxels 
needed to complete the segmentation is a robust indicator of user effort. 

Actuated voxels after initialization are plotted in Fig- 16. At termination, all of the 
segmentations have greater than 98% overlap with a manually created reference. For the 
adult epiphysis image of Fig- 15a, the average final actuated voxel counts are 348 (GrabCut) 
and 118 (proposed). For the juvenile physis segmentation, the averages are 536 (GrabCut) 
and 141 (proposed). Segmenting the physis is more difficult with GrabCut due to the 
elongated shape, the nearly identically-looking fluid around the bone and the bimodal 
appearance of cortical bone above and spongy bone below the physis. A GrabCut iteration 
can change the segmentation dramatically; when this change is erroneous, significant 
corrective effort becomes required. In Fig- 16b, we see this manifested by the large increases 
in actuated voxels during the first few rounds of GrabCut user input. In contrast, the 
proposed algorithm provides rapid continuous visual feedback for the user; small corrections 
are made before a large error can develop. 

Predictability of how the segmentation changes in response to mouse strokes is a criterion 
for practical ease of use. Two scatterplots quantify the predictability in Fig- 17; dynamic 
response is characterized in terms of the number of reclassified voxels (Y -axis) and the 
number of newly actuated voxels (X-axis). Each mark corresponds to one iteration when 
new user input was applied. Linear regression lines are overlaid on the data. The two 
algorithms have a similar dynamic response in the epiphysis segmentation, shown in 
Fig- 17a; correlation coefficients are 0.70 (GrabCut) and 0.90 (proposed algorithm). Two 
issues become apparent in Fig- 17b for the juvenile physis scenario. First, the distribution of 
GrabCut data points is quite broad; correlation coefficients are 0.61 and 0.92 for GrabCut 
and the proposed algorithm, respectively. Second, many of the GrabCut data points are 
below the dashed green line, indicating a waste of user effort since there are more voxels 
actuated than reclassified. The dynamic response of GrabCut makes it hard for a user to 
predict how much change new mouse strokes will cause. 

VII. Conclusion 

This paper has presented a modeling approach that enables control-theoretic analysis and 
design for interactive medical image segmentation. Results shown for a synthetic image 
(Section IV-D) and real medical volumes (Section V) agree with theoretical expectations of 
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system performance. Section V illustrated two qualitatively different situations: (1) gradual 
expansion of the boundary to bound the entire femur fragment and (2) prevention of "bleed- 
through" or over- segmentation with the patellar tendon. In both situations, the user is able to 
drive the segmentation to a desired steady state and to do so with much less effort in terms 
of actuated voxels than manual segmentation. In summary, the PDE control formulation 
enables us to guarantee a user's ability to reach a reference segmentation state while also 
absolving them of the need to understand mathematical details or use precise mouse 
movement. 

Successful use of the closed-loop algorithm by medical students motivates several future 
extensions. If a single image contains several objects of interest, they would need to be 
extracted sequentially in the current framework. Such a sequential de-coupled approach does 
not address natural constraints of the geometry and involves re-editing common boundaries. 
A coupled formulation using an open-loop system of PDEs such as in [20] together with a 
vector of control inputs would prevent region overlap and reduce the user's effort when 
segmenting multiple regions. Informative visualization is vital for efficient interaction, since 
performance of any interactive segmentation method is limited by how quickly and 
accurately a user can infer the segmentation state [56]-[58]. An interesting extension to the 
theory would consider the feedback between visualization and the creation of user input; for 
example, it may be desirable to confine movement of the boundary to regions that are 
observable from the user's viewpoint in 3D. 
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Figure 1. 

Segmentation by minimizing a meaningful image-dependent functional is not sufficient when the desired anatomic boundary 
not actually a minimizer (left). An expert user would typically desire to make some corrections (right) that contradict the 

functional' s minimizer. 
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Figure 2. 

Block diagram of the proposed control formulation. Feedback compensates for deficiencies in automatic segmentation by 

exploiting the human expert's interpretation of complex imagery. 
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(a) 0(x) with 0 level set in red. 




(b) Regularized Heaviside function H e o 0(x). 



Figure 3. 

The interior of a segmented region satisfies <p(x) > 0 while its exterior has $x) < 0. 
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time (scaled) 



Figure 4. 

E(t) functional (Eq- 8) regulated by the open-loop system. Corresponding images shown in Fig-5. 
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Figure 5. 

Desired segmentation and { t) = 0} level-set overlayed on image I(x). A user would like to segment the "left ventricle" in 
this synthetic image that resembles an MRI scan of the brain. Evolving $ according to Eq-9 shrinks E(t) successfully (Fig4). 

However, the open-loop system fails to segment the desired region. 
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U (x, t) changes. 




C/(x, 0 = constant 



< t < 4 



Figure 6. 

After initialization, the inner loop of Fig-2 updates (/> and yr. Input from a human user applies impulses at times that 
accumulate as U(x, t). Between times fy, the inner loop changes steady-state in response to updated U(x, t). The user stops 

applying input when the visualization of t) is satisfactory. 
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(a) Simulated (b) Nominal Ut (c) Ut via b£q- 33 

Figure 7. 

Regulating the input-integration with nonlinear diffusion keeps U smooth and bounded. Diffusion occurs when inputs occur 

in excess of Uy[\ here, Um = 10. 
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(a) No user input: $ follows <£. 




(hi With user input: sufficient U drives <p towards il\ 



Figure 8. 

A synthetic image example. The human user seeks to segment only the left ellipsoid region; the open-loop system tends to creep 
through and segment the union of the two ellipsoids. 1^1 < c and I yA < c are denoted by solid-green and dashed-red contours, 

respectively. Dotted regions in (b) indicate \ U\ > 0. 
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Figure 9. 

Shown above are the values of V and T corresponding to the synthetic image example of Fig-8. Left: ^is driven only by the 
image-dependent term in the absence of user input. Right: T(t) rapidly grows when the user applies input. n(t) rapidly shrinks as 

^is drawn towards yr. At steady-state, v(t) < 0.5 and T(t) < 3. 
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(a) tfscr\ nxHJMT in|>ui. ib) V*to views t*l" lii»al segmentation . 



Figure 10. 

Segmentation of shattered hip bone fragments in a CT scan. The image volume is 156 x 162 x 229 voxels with a 0.7mm grid 

spacing. 
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Figure 11. 

In (a)-(d), regions of user input are shown as markers on the progressing segmentation (dark) overlayed on user's reference 
boundary (light). The segmentation in Fig- 11a is the steady state of the open-loop system. In Fig- lid, the segmentation agrees 
with the desired reference boundary due to the closed-loop system's incorporation of user input. 
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1.1.1 One slice, overlapping h x) histogram (b) One region, varying /(x) histogram 



Figure 12. 

Tissues within one MRI slice have overlapping intensity histograms while a single tissue across slices has a varying histogram. 
Separation of regions must consider the spatially- varying image statistics. 
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(a) Single-slice view, (b) Two views of final segmentation. 



Figure 13. 

A segmentation of the patella and patellar tendon in MRI, part of a study on graft selection for anterior cruciate ligament (ACL) 
repair. The image volume is512x512x 224 voxels with a 0.4mm grid spacing. 
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(a) I = 0.1 <b) t = 0.2 (c) t = 0.7 <d) t = 1.0 
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(c) L 5 scr actuated voxels over time. 



Figure 14. 

Regions of significant user input shown together with the changing segmentation. Locations where U>0 and U < 0 correspond 
to red and blue markers, respectively. The open-loop system's tendency towards "bleed-through" near the insertion points is 
handled in the closed-loop system by incorporating user input with negative U. 
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(a) adult epiphysis (spongy bone) 




(b) juvenile physis (growth plate) 



Figure 15. 

Two test images are used in a quantitative comparison of GrabCut and the proposed algorithm. Regions of interest are outlined 

above; several interactive segmentations of each region are performed. 
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(a) Actuated voxel count, adult epiphysis. (b) Actuated voxel count, juvenile physis. 



Figure 16. 

Comparison of actuated voxels over time, after initialization. The proposed algorithm has both a lower mean actuated count and 

tighter clustering across repeated segmentations. 
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(a) Dynamic response, adult epiphysis. (b) Dynamic response, juvenile physis. 



Figure 17. 

Comparison of dynamic response to user input; data points and linear fit lines are shown. Points below the dashed green line 
indicate wasted user effort since more additional voxels were actuated than reclassified. 
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